On the use of orthogonal GMM in speaker recognition
Abstract
Gaussian mixture modeling (GMM) techniques are increasingly being used for both speaker identification and verification. Most of these models assume diagonal covariance matrices. Although empirically any distribution can be approximated with a diagonal GMM, a large number of mixture components is usually needed to obtain a good approximation. A consequence of using a large GMM is that training is time-consuming and recognition is slow. This paper proposes a modification to the standard diagonal GMM approach. The proposed scheme includes an orthogonal transformation: feature vectors are first transformed to the space spanned by the eigenvectors of the covariance matrix before being passed to the diagonal GMM. This transformation adds only a small computational load, but results from both speaker identification and verification experiments indicate that it considerably improves recognition performance. For a given performance level, the GMM with the orthogonal transform needs only one-fourth the number of Gaussian functions required by the standard GMM.
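The orthogonal transformation described in the abstract can be illustrated with a minimal sketch. It assumes a single covariance matrix estimated from all training frames of one speaker (the abstract does not say whether the covariance is global or per component), and it uses synthetic correlated vectors in place of real speech features. The function names `fit_orthogonal_transform` and `apply_orthogonal_transform` are hypothetical and do not come from the paper.

```python
import numpy as np


def fit_orthogonal_transform(features):
    """Return the eigenvector basis of the feature covariance matrix.

    features: array of shape (n_frames, n_dims).
    Assumption: one covariance matrix for the whole training set.
    """
    cov = np.cov(features, rowvar=False)
    # eigh handles symmetric matrices; the columns of eigvecs are
    # orthonormal eigenvectors of cov.
    _, eigvecs = np.linalg.eigh(cov)
    return eigvecs


def apply_orthogonal_transform(features, eigvecs):
    """Project each feature vector onto the eigenvector basis."""
    return features @ eigvecs


# Synthetic correlated data standing in for speech feature vectors.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(4, 4))
features = rng.normal(size=(2000, 4)) @ mixing

eigvecs = fit_orthogonal_transform(features)
transformed = apply_orthogonal_transform(features, eigvecs)

# After the projection the sample covariance is diagonal up to
# floating-point error, which matches the diagonal-covariance assumption.
cov_after = np.cov(transformed, rowvar=False)
off_diag = cov_after - np.diag(np.diag(cov_after))
max_off_diag = float(np.abs(off_diag).max())
```

The transformed vectors would then be used to train and score an ordinary diagonal-covariance GMM. The same `eigvecs` estimated at training time would have to be applied to test features as well.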
Similar resources
An orthogonal GMM based speaker verification system
This paper describes a new speaker verification system based on orthogonal Gaussian mixture modeling (GMM) techniques combined with maximum a posteriori (MAP) adaptation. In most of the GMM based speaker verification systems, the variance of each component is constrained to be diagonal for its computational simplicity. However, this approximation inevitably introduces performance degradation. T...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Efficient Training of GMM Based Speaker Recognition System
Automatic speaker recognition (ASR) is based on speech feature vectors, models, and classifiers. To improve speaker recognition performance, we must improve at least one of these modules. In this paper we propose to use subband spectral centroids (SSCs) as complementary features alongside the traditional MFCC features, and a new GMM training algorithm, with the ultimate goal to search the bette...
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...